High Dimensional Linear Regression
High Dimensional Linear Regression using Lattice Basis Reduction
We consider a high dimensional linear regression problem where the goal is to efficiently recover an unknown vector \beta^* from n noisy linear observations Y = X\beta^* + W. Unlike most of the literature on this model, we make no sparsity assumption on \beta^*; instead we adopt a regularization based on assuming that the underlying vector \beta^* has rational entries with the same denominator Q. We call this the Q-rationality assumption. We propose a new polynomial-time algorithm for this task based on the seminal Lenstra-Lenstra-Lovász (LLL) lattice basis reduction algorithm. We establish that under the Q-rationality assumption, our algorithm exactly recovers the vector \beta^* for a large class of distributions of the i.i.d. entries of X and non-zero noise W. We prove that it succeeds when the noise is small, even when the learner has access to only one observation (n=1). Furthermore, we prove that in the case of Gaussian white noise W, with n = o(p/\log p) and Q sufficiently large, our algorithm tolerates a nearly information-theoretically optimal level of noise.
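The core primitive this abstract refers to is LLL lattice basis reduction. Below is a minimal, self-contained Python sketch of that primitive only (exact rational arithmetic, Lovász parameter δ = 3/4), not the paper's lattice construction for regression; the function name `lll_reduce` and the toy basis in the usage line are our own illustrative choices.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lll_reduce(basis, delta=Fraction(3, 4)):
    """Minimal LLL lattice basis reduction sketch.

    `basis` is a list of linearly independent integer row vectors;
    returns a delta-LLL-reduced basis of the same lattice.
    """
    b = [[Fraction(x) for x in row] for row in basis]
    n = len(b)

    def gso():
        # Gram-Schmidt data: bstar[i] = b[i] - sum_{j<i} mu[i][j] * bstar[j]
        bstar, mu = [], [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = list(b[i])
            for j in range(i):
                mu[i][j] = dot(b[i], bstar[j]) / dot(bstar[j], bstar[j])
                v = [vi - mu[i][j] * wj for vi, wj in zip(v, bstar[j])]
            bstar.append(v)
        return bstar, mu

    bstar, mu = gso()
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 by subtracting integer multiples.
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])
            if q != 0:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
                bstar, mu = gso()
        # Lovász condition: advance if satisfied, otherwise swap and step back.
        if dot(bstar[k], bstar[k]) >= (delta - mu[k][k - 1] ** 2) * dot(bstar[k - 1], bstar[k - 1]):
            k += 1
        else:
            b[k], b[k - 1] = b[k - 1], b[k]
            bstar, mu = gso()
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in b]

# Example: reduce a small 3x3 integer basis into short, nearly orthogonal vectors.
print(lll_reduce([[1, -1, 3], [1, 0, 5], [1, 2, 6]]))
```

Recomputing the Gram-Schmidt data from scratch after every update keeps the sketch short at the cost of efficiency; practical implementations update it incrementally.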
Reviews: High Dimensional Linear Regression using Lattice Basis Reduction
The paper presents a novel method for exactly recovering a vector of coefficients in high-dimensional linear regression, with high probability as the dimension goes to infinity. The method assumes that the correct coefficients come from a finite discrete set of bounded rational values, but it does not - as is commonplace - assume that the coefficient vector is sparse. To achieve this, the authors extend a classical algorithm for lattice basis reduction. Crucially, this approach does not require the sample size to grow with the dimension; thus, in certain cases the algorithm is able to recover the exact coefficient vector from just a single sample (provided the dimension is sufficiently large). The novel connection between high-dimensional linear regression and lattice basis reduction is the main strength of the paper.
Memorize to Generalize: on the Necessity of Interpolation in High Dimensional Linear Regression
Cheng, Chen; Duchi, John; Kuditipudi, Rohith
We examine the necessity of interpolation in overparameterized models, that is, when achieving optimal predictive risk in machine learning problems requires (nearly) interpolating the training data. In particular, we consider simple overparameterized linear regression $y = X \theta + w$ with random design $X \in \mathbb{R}^{n \times d}$ under the proportional asymptotics $d/n \to \gamma \in (1, \infty)$. We precisely characterize how prediction (test) error necessarily scales with training error in this setting. An implication of this characterization is that as the label noise variance $\sigma^2 \to 0$, any estimator that incurs at least $\mathsf{c}\sigma^4$ training error for some constant $\mathsf{c}$ is necessarily suboptimal and will suffer growth in excess prediction error at least linear in the training error. Thus, optimal performance requires fitting training data to substantially higher accuracy than the inherent noise floor of the problem.
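As a quick illustration of the setting described above (not of the paper's results), the snippet below simulates the proportional overparameterized regime $d/n = \gamma > 1$ with isotropic Gaussian design and reports the training error and excess prediction error of the minimum-$\ell_2$-norm interpolator; the constants `n`, `gamma`, `sigma` are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Proportional overparameterized regime: d/n = gamma > 1 (illustrative values).
n, gamma, sigma = 200, 4.0, 0.5
d = int(gamma * n)

theta_star = rng.normal(size=d) / np.sqrt(d)      # ground truth with ||theta*|| ~ 1
X = rng.normal(size=(n, d))                       # random design
y = X @ theta_star + sigma * rng.normal(size=n)   # noisy labels

# Minimum-l2-norm interpolator: theta_hat = X^T (X X^T)^{-1} y (zero training error when d > n).
theta_hat = X.T @ np.linalg.solve(X @ X.T, y)

train_err = np.mean((X @ theta_hat - y) ** 2)
# For isotropic test points x, excess prediction risk is ||theta_hat - theta*||^2.
excess_pred_err = np.sum((theta_hat - theta_star) ** 2)

print(f"training error        : {train_err:.3e}")
print(f"excess prediction err : {excess_pred_err:.3f}")
```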
The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R
Li, Xingguo; Zhao, Tuo; Yuan, Xiaoming; Liu, Han
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, $\ell_q$ Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). These methods exploit different nonsmooth loss functions to gain modeling flexibility, estimation robustness, and insensitivity to tuning. The solver is based on the alternating direction method of multipliers (ADMM). The package flare is implemented in double-precision C and called from R through a user-friendly interface. Memory usage is optimized by using sparse matrix output. The experiments show that flare is efficient and can scale up to large problems.
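flare itself is written in C and called from R; the following is only a toy numpy sketch of an ADMM splitting for one of the losses the package covers (LAD Lasso, minimize $\|X\beta - y\|_1 + \lambda \|\beta\|_1$), with our own choice of splitting, variable names, and synthetic data, not the package's solver.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Elementwise soft-thresholding operator S_kappa(a)."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def lad_lasso_admm(X, y, lam, rho=1.0, n_iter=500):
    """Toy ADMM for LAD Lasso: minimize ||X b - y||_1 + lam * ||b||_1.

    Splitting: z = X b - y and w = b, with scaled dual variables u and v.
    """
    n, p = X.shape
    beta = np.zeros(p)
    z, u = np.zeros(n), np.zeros(n)
    w, v = np.zeros(p), np.zeros(p)
    gram = X.T @ X + np.eye(p)          # reused in every beta-update
    for _ in range(n_iter):
        beta = np.linalg.solve(gram, X.T @ (y + z - u) + (w - v))
        z = soft_threshold(X @ beta - y + u, 1.0 / rho)   # prox of the l1 loss
        w = soft_threshold(beta + v, lam / rho)           # prox of the l1 penalty
        u += X @ beta - y - z
        v += beta - w
    return w  # the sparse copy of beta

# Illustrative use on synthetic data with heavy-tailed noise, where an l1 loss helps.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 30))
beta_true = np.zeros(30)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_cauchy(100)
print(np.round(lad_lasso_admm(X, y, lam=2.0), 2)[:5])
```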